Improving Compressed Counting

Author

  • Ping Li
Abstract

Compressed Counting (CC) [22] was recently proposed for estimating the αth frequency moments of data streams, where 0 < α ≤ 2. CC can be used for estimating Shannon entropy, which can be approximated by certain functions of the αth frequency moments as α → 1. Monitoring Shannon entropy for anomaly detection (e.g., DDoS attacks) in large networks is an important task. This paper presents a new algorithm for improving CC. The improvement is most substantial when α → 1−. For example, when α = 0.99, the new algorithm reduces the estimation variance roughly by 100-fold. This new algorithm would make CC considerably more practical for estimating Shannon entropy. Furthermore, the new algorithm is statistically optimal when α = 0.5.
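
The link between frequency moments and Shannon entropy mentioned in the abstract can be illustrated with a small sketch. It assumes the Tsallis entropy, T(α) = (1 − F(α)/F(1)^α)/(α − 1), as the "function of the αth frequency moments"; this is one standard choice that converges to Shannon entropy as α → 1, and the abstract does not say which function the paper uses. The moments here are computed exactly from a small made-up count vector rather than estimated with CC, and all function names are hypothetical.

```python
import math

def frequency_moment(counts, alpha):
    """Exact alpha-th frequency moment: sum of counts[i]**alpha over nonzero entries."""
    return sum(c ** alpha for c in counts if c > 0)

def shannon_entropy(counts):
    """Shannon entropy (natural log) of the empirical distribution of the counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def tsallis_entropy(counts, alpha):
    """Tsallis entropy written in terms of F(alpha) and F(1); requires alpha != 1.

    Assumption: Tsallis entropy stands in for the unspecified function in the abstract.
    """
    f1 = frequency_moment(counts, 1.0)
    fa = frequency_moment(counts, alpha)
    return (1.0 - fa / f1 ** alpha) / (alpha - 1.0)

# Illustrative (made-up) stream counts, not data from the paper.
counts = [50, 30, 15, 5]
for alpha in (0.9, 0.99, 0.999):
    gap = abs(tsallis_entropy(counts, alpha) - shannon_entropy(counts))
    print(f"alpha={alpha}: |Tsallis - Shannon| = {gap:.6f}")
```

Because the formula divides by (α − 1), any error in an estimate of F(α) is amplified by roughly 1/|α − 1| as α approaches 1, which suggests why a lower-variance estimator near α = 1 matters for entropy estimation.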

Similar resources

Sparse Recovery with Very Sparse Compressed Counting

Compressed sensing (sparse signal recovery) often encounters nonnegative data (e.g., images). Recently [11] developed the methodology of using (dense) Compressed Counting for recovering nonnegative K-sparse signals. In this paper, we adopt very sparse Compressed Counting for nonnegative signal recovery. Our design matrix is sampled from a maximally-skewed α-stable distribution (0 < α < 1), and ...

Optimal Trade-Offs for Succinct String Indexes

Let s be a string whose symbols are solely available through access(i), a read-only operation that probes s and returns the symbol at position i in s. Many compressed data structures for strings, trees, and graphs, require two kinds of queries on s: select(c, j), returning the position in s containing the jth occurrence of c, and rank(c, p), counting how many occurrences of c are found in the f...

The Optimal Quantile Estimator for Compressed Counting

Compressed Counting (CC) was recently proposed for very efficiently computing the (approximate) αth frequency moments of data streams, where 0 < α ≤ 2. Several estimators were reported including the geometric mean estimator, the harmonic mean estimator, the optimal power estimator, etc. The geometric mean estimator is particularly interesting for theoretical purposes. For example, when...

Post-Operative Time Effects after Sciatic Nerve Crush on the Number of Alpha Motoneurons, Using a Stereological Counting Method (Disector)

There is extensive evidence that axonal processes of the nervous system (peripheral and/or central) may degenerate after nerve injuries. Wallerian degeneration and chromatolysis are the most conspicuous phenomena that occur in response to injuries. In this research, the effects of post-operative time following sciatic nerve crush on the number of spinal motoneurons were investigated....

Compressed counting

Counting is a fundamental operation. For example, counting the αth frequency moment, $F_{(\alpha)} = \sum_{i=1}^{D} A_t[i]^{\alpha}$, of a streaming signal $A_t$ (where t denotes time), has been an active area of research in theoretical computer science, databases, and data mining. When α = 1, the task (i.e., counting the sum) can be accomplished using a counter. When α ≠ 1, however, it becomes non-trivial to des...

Publication date: 2009